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Abstract 


An  efficient  implementation  of  background  error  correlation  modeling  for  ocean  data 
assimilation  based  on  the  implicit  solution  of  a  diffusion  equation  is  presented  in  this  work.  This 
study  is  an  extension  of  Weaver  and  Courtier  (2001),  which  sought  to  model  error  correlations 
based  on  the  explicit  solution  of  a  generalized  diffusion  equation.  The  implicit  solution  is 
unconditionally  stable,  therefore  larger  time  steps  can  be  used  in  the  calculation  than  in  the 
explicit  solution,  which  needs  smaller  time  steps  to  maintain  stability.  This  is  especially  true 
when  modeling  anisotropic  correlations,  or  when  using  a  non-uniform  model  grid  (e.g. 
curvilinear  grid  spacing).  Both  implicit  and  explicit  methods  are  tested  in  terms  of  numerical 
efficiency  and  practical  implementation.  To  that  end,  a  set  of  simulated  and  real  data 
assimilation  experiments  are  carried  out  using  a  three-dimensional  variational  (3D-Var) 
algorithm  that  has  been  developed  as  a  test-bed  for  these  correlation  models.  The  results  of  both 
the  implicit  and  explicit  method  are  compared  to  show  that  while  the  implicit  method  provides 
the  same  correlation  shape,  size,  and  magnitude  as  the  explicit,  it  does  so  at  a  much  lower 
computational  cost.  For  the  experiments  shown  here  the  implicit  solution  can  be  up  to  five  times 
as  efficient  in  terms  of  CPU  time  than  the  explicit,  while  also  providing  a  nearly  identical 
analysis  and  forecast  in  terms  of  deviation  from  independent  observations. 
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1.0  INTRODUCTION 


The  specification  of  background  error  covariances  in  any  assimilation  scheme  is  one  of 
the  most  important  tasks  in  the  field  of  data  assimilation.  Until  recently,  most  data  assimilation 
schemes  have  assumed  the  structure  of  the  covariances  to  be  isotropic  and  homogeneous. 
However,  more  recent  studies  suggest  that  this  is  a  major  shortcoming  of  most  data  assimilation 
methods  (Kalnay  et  al.  1997;  Houtekamer  and  Mitchell  1998;  and  Errico  1999;  Purser  et  al. 
2003b)  as  this  assumption  restricts  the  flow  of  observational  information  to  circular  regions 
surrounding  the  measurement  location  on  the  analysis  grid.  Otte  et  al.  (2001)  points  out  that 
assuming  a  circular  influence  region  ignores  important  features  such  as  temperature  and  wind 
gradients  that  may  provide  valuable  information  as  to  the  structure  of  the  air  mass. 

Several  studies  have  been  made  to  investigate  the  construction  of  anisotropic  and 
inhomogeneous  error  correlations  on  the  analysis  grid.  Purser  et  al  (2003b)  suggest  that  it  is 
possible  to  obtain  some  measure  of  local  anisotropy  depending  on  the  geometry  of  the  chosen 
analysis  grid.  They  point  to  studies  done  by  Shapiro  and  Hastings  (1973)  and  Benjamin  (1989) 
who  perfonn  an  analysis  in  isentropic  coordinates,  which  provides  increased  vertical  resolution 
in  regions  that  exhibit  high  static  stability.  This  approach  can  be  troublesome,  however,  due  to 
the  lack  of  control  on  the  degree  of  anisotropy  as  well  as  being  limited  in  the  variety  of  shapes 
one  could  use.  Instead,  it  has  been  suggested  by  numerous  studies,  most  notably  by  Purser  et  al. 
(2003b)  and  Weaver  and  Courtier  (2001),  that  to  obtain  a  controllable  inhomogeneous  and 
anisotropic  structure,  one  must  define  an  error  correlation  operator  (as  a  component  of  the  full 
covariance)  with  the  built-in  capability  to  model  these  anisotropic  features. 

Purser  et  al.  (2003a)  introduce  a  method  to  define  a  correlation  operator  based  on 
recursive  filters.  In  this  work  the  filters  are  purely  homogeneous  and  isotropic.  They 
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demonstrate  a  method  to  extend  the  filter  algorithm  to  include  anisotropic  structures.  3D 
anisotropic  features  are  captured  by  utilizing  a  type  of  hexad  algorithm  that  applies  the  filter  in 
the  direction  of  six  nonstandard  grid  lines  to  achieve  some  form  of  “stretching”  in  the  structure 
functions.  This  covariance  application  has  been  utilized  in  several  studies,  most  notably  in  Wu 
et  al.  (2002)  and  in  Liu  et  al.  (2007)  where  it  was  applied  in  a  3D-Var  environment  in  an  effort  to 
assimilate  GPS  slant-path  water  vapor  observations. 

The  work  in  this  study  is  an  extension  of  that  done  by  Weaver  and  Courtier  (2001)  who 
aim  to  model  anisotropic  and  inhomogeneous  correlations  for  the  ocean  on  a  sphere  using  the 
diffusion  equation.  Weaver  and  Courtier  (2001)  build  on  previous  studies  done  by  Egbert  et  al 
(1994)  and  Derber  and  Rosati  (1989)  who  proposed  the  use  of  an  iterative  Laplacian  grid  point 
filter  to  build  error  correlations.  In  the  case  of  Weaver  and  Courtier  (2001)  the  Laplacian 
operator  is  interpreted  to  be  a  time-step  integration  of  a  diffusion  equation,  where  the  integral 
kernel  of  the  equation  is  the  representation  of  a  covariance  function.  Their  work  demonstrates 
the  ability  to  model  anisotropic  correlations  by  defining  the  diffusion  coefficient  as  a  function  of 
the  analysis  grid.  This  method  has  been  employed  successfully  in  numerous  studies,  including 
Weaver  et  al  (2003),  Ngodock  (2005),  Weaver  et  al  (2005),  and  Pannekoucke  and  Massart 
(2008). 

Weaver  and  Courtier  (2001)  base  their  method  on  the  explicit  solution  to  a  generalized 
diffusion  equation.  A  consequence  of  this  is  that  the  correlation  operator  is  only  conditionally 
stable.  Depending  on  the  degree  of  anisotropy  in  the  correlation  structures,  this  method  could 
require  several  hundred  or  even  thousands  of  time  stepping  iterations  to  produce  a  correlation 
field  due  to  the  CFL  stability  criterion.  Purser  et  al.  (2003b)  point  out  that  one  of  the  main 
advantages  of  their  filter  approach  is  the  relative  speed  in  comparison  to  the  diffusion  method. 
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The  work  presented  in  here  illustrates  the  use  of  the  implicit  solution  to  a  diffusion 
equation  to  increase  the  computational  efficiency  of  the  Weaver  and  Courtier  (2001)  algorithm. 
In  the  next  section  a  detailed  examination  of  the  implicit-solution  scheme  is  presented  as  well  as 
a  general  comparison  between  the  correlation  shapes  produced  from  the  implicit  and  explicit 
solutions  of  the  diffusion  equation.  Section  three  then  outlines  the  set-up  and  results  for  both  the 
simulated  and  the  real  data  assimilation  experiments  in  the  Monterey  Bay  and  Hawaiian  regions, 
respectively.  These  experiments  demonstrate  the  usefulness  of  the  error  correlation  operator  in  a 
three-dimensional  variational  (3D-Var)  assimilation  algorithm  and  illustrates  the  advantage  of 
using  the  implicit  solution  to  the  diffusion  equation  rather  than  the  explicit.  Section  four 
summarizes  the  presented  work  and  provides  a  brief  discussion  of  the  intended  future  work. 

2.0  THE  BACKGROUND  ERROR  CORRELATION  OPERATOR 

Weaver  and  Courtier  (2001;  hereto  WC2001)  based  their  work  on  that  done  by  Derber 
and  Rosati  (1989)  who  demonstrate  that  an  iterative  Laplacian  grid-point  filter  can  be  used  to 
model  correlations.  One  application  of  this  approach  assumes  that  the  Laplacian  filter  be  viewed 
as  a  time-step  integration  of  a  diffusion  equation  (Egbert  et  al.  1994;  Bennett  et  al.  1997). 
WC2001  build  on  this  approach  to  define  2D  and  3D  univariate  correlation  models  that  are  not 
only  efficient  in  terms  of  computational  speed,  but  also  provide  a  method  for  constructing 
anisotropic  and  inhomogeneous  correlations.  For  a  more  complete  description  of  this  error 
correlation  operator,  the  authors  refer  the  reader  to  WC2001. 

2.1  Correlation  Operator  Using  the  Implicit  Solution  of  a  Diffusion  Equation 

As  noted,  the  correlation  operator  is  built  upon  the  solution  to  the  standard  diffusion 
equation: 

=  V  •  (kVj,).  (1) 
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where  the  diffusion  coefficient  is  a  spatially  varying  quantity  and  therefore  can  be  used  to 
modify  the  length-scale  and  shape  of  the  correlation  based  on  any  predetermined  field;  it  should 
be  noted  that  the  Laplacian  operator  is  three-dimensional.  The  explicit  solution  of  (1)  is  of  the 
form: 

r]n+ 1  =  r]n  +  AtV  •  (/cV?7n).  (2) 

It  can  be  proved  that  in  order  to  maintain  stability  the  time  step  (M)  should  be  set  as  M  > 
2 {Lie)  ,  where  L  is  the  correlation  length  scale  and  e  the  horizontal  grid  resolution.  This 
requirement  in  the  explicit  solution  affects  the  computational  cost  of  modeling  anisotropic  and 
inhomogeneous  correlations.  The  result  is  a  dramatic  increase  in  computational  time  for  the 
explicit  solution  since  M  would  need  to  be  computed  using  the  ratio  of  the  largest  correlation 
length  scale  value  to  the  horizontal  grid  step  (LmdJe).  Due  to  this,  we  introduce  another 
approach  to  solve  for  (1)  using  an  implicit  scheme.  Here,  the  solution  to  (1)  can  be  written  as 

rjn+ 1  =  rjn  +  AtV  •  (/cV?7n+1).  (3) 

The  solution  (3)  is  unconditionally  stable  and  does  not  require  prohibitively  small  time  steps  for 
integration  (Weaver  and  Ricci,  2004).  From  (3)  if  A  =  AtV  •  (kV)  ,  then  equation  (3)  can  be  re¬ 
arranged  to  the  form, 

(7  -  2l)?7n+1  =  7]n,  (4) 

which  can  be  solved  using  a  conjugate  gradient  algorithm.  Using  (4)  for  the  filter  design  reduces 
the  computational  time  required  over  the  explicit  solution.  Using  the  conjugate  gradient  for 
solving  (4)  requires  a  stopping  criterion  for  convergence.  A  series  of  experiments  have  been 
done  to  investigate  the  impact  of  the  convergence  criterion  on  the  solution.  Experiments  with 
stringent  convergence  criterion  (residual  less  than  1.0x1  O'5)  have  displayed  no  substantial  gain 
when  compared  to  looser  criterion.  It  can  be  shown  that  any  value  of  the  residual  between 
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2  5 

1.0x10'  to  1.0x10"  can  provide  adequately  accurate  results  (as  defined  by  a  comparison  to  the 
explicit-solution  operator),  however,  it  should  be  noted  that  a  residual  value  of  1.0x1  O'1  has  been 
found  to  be  inadequate  for  the  purposes  of  this  correlation  operator.  For  the  work  shown  here,  a 
residual  criterion  of  l.OxlO'5  has  been  selected  as  a  balance  between  accuracy  and  efficiency; 
however  a  looser  criterion  could  have  also  been  used  and  would  have  afforded  an  even  greater 
cost  savings  over  the  explicit-solution  operator  than  is  shown  here. 

The  explicit  and  implicit  solutions  are  approximations  of  the  true  solution  to  the  diffusion 
equation.  The  difference  between  the  approximation  and  the  true  solution  is  related  to  the  size  of 
the  time  step  used  to  solve  the  equation  either  implicitly  or  explicitly;  this  difference  is  known  as 
truncation  error.  Since  the  implicit  solution  normally  uses  a  larger  time  step  than  the  explicit 
solution,  these  two  operators  will  provide  slightly  different  results.  This  can  be  mitigated  by 
applying  the  implicit-solution  correlation  operator  numerous  times,  thereby  shortening  the  time 
step  used.  This  does  result  in  an  increase  in  the  computational  cost  of  the  operator,  however  not 
so  much  as  to  eliminate  the  cost  savings  over  the  explicit-solution  operator.  For  the  results 
shown  in  this  work,  the  implicit-solution  operator  has  been  applied  ten  times  in  order  to  closely 
approximate  the  results  of  the  explicit-solution  operator.  Figure  1  compares  the  CPU  time  used 
to  run  the  explicit  solution  (red)  and  implicit  solution  (dashed  blue)  for  a  selected  test  case.  This 
experiment  involves  an  arbitrary  3D  grid  of  157x130x46  with  a  resolution  of  1  km.  A  total  of 
4250  Dirac  impulses,  at  various  horizontal  and  vertical  positions,  are  passed  through  both  filters. 
The  computer  codes  for  these  operators  are  both  currently  serial  versions  and  are  run  on  one 
Opteron  2200  2.8  GHz  processor.  The  grid  resolution  is  fixed  at  1  km,  however  six  different 
horizontal  length  scale  values  are  used:  5  km,  10  km,  20  km,  30  km,  40  km,  and  50  km.  Fig.  1 
shows  that  the  CPU  time  for  the  explicit  solution  rises  rapidly,  whereas  the  CPU  time  for  the 
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implicit  solution  increases  at  a  much  lower,  nearly  linear  rate. 

It  should  be  noted  that  the  algorithm  also  employs  a  set  of  normalization  factors  to  ensure 
that  the  solution  of  the  algorithm  is,  in  fact,  a  correlation  field.  Any  number  of  methods  can  be 
employed  to  calculate  the  nonnalization  factors  such  as  an  explicit  calculation  or  Monte  Carlo 
techniques.  The  nonnalization  factors  would  then  be  applied  to  the  algorithm  as  in  Weaver  and 
Ricci  (2004).  Figure  2  shows  a  side-by-side  comparison  of  an  error  conelation  in  an  arbitrary 
grid  where  a  set  of  normalization  factors  have  been  used.  Fig.  2a  (fig.  2b)  shows  the  conelation 
constructed  using  the  explicit  (implicit)  solution.  In  both  figures  the  horizontal  grid  is  157x130 
with  a  length  scale  of  60  km  and  a  resolution  of  6  km;  no  length  scale  modification  is  used.  The 
implicit  solution  produces  a  correlation  with  nearly  identical  shape,  size,  and  magnitude  as  the 
explicit  solution.  The  difference  field  is  shown  in  figure  2c. 

To  demonstrate  that  these  two  methods  produce  similar  results  when  simulating 
anisotropic  correlations,  an  empirical  length  scale  modification  is  employed  that  accounts  for  the 
changes  in  bathymetry.  It  has  been  suggested  that  correlations  become  horizontally  stretched  in 
the  along  shore  direction  when  near  a  coastline  boundary  (Li  et  al,  2008;  Weaver  and  Ricci, 
2004)  in  shallow  water.  This  anisotropic  feature  can  be  approximated  using  a  quadratic  function 
of  bathymetry, 

Kij.k  —  C(D  —  dij>k)  +  1  (5) 

where  is  the  set  of  spatially  varying  diffusion  coefficients,  D  defines  the  maximum  depth  of 
the  water  column  where  correlation  stretching  will  occur  (model  grid  points  with  depths  greater 
than  D  will  be  isotropic),  and  djj is  the  depth  at  model  grid  point  (i,j,k).  It  should  be  noted  that 
the  minimum  value  of  K  is  never  allowed  to  go  below  1 .0  and  its  maximum  value  is  capped  at 
5.0  by  the  constant  c.  This  constant  is  calculated  as  the  maximum  diffusion  coefficient  value 
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(/cmax)  minus  the  minimum  value  (/cmm)  normalized  by  the  square  of  D.  Figure  3  shows  the  side- 
by-side  comparison  of  an  error  correlation  using  this  length  scale  modification  feature  with  fig. 
3a  (fig.  3b)  using  the  explicit  (implicit)  solution.  The  difference  field  is  shown  in  figure  3c.  In 
this  case,  an  actual  geographical  region  is  needed  with  a  land  mass  within  the  model  domain. 
Here,  a  near-shore  example  around  Monterey  Bay,  California  is  used  with  a  81x58x41  grid  and  1 
km  horizontal  grid  resolution.  Both  correlations  exhibit  obvious  along-shore  stretching  in 
relation  to  the  relatively  shallow  coastal  bathymetry,  shown  in  figure  3d. 

3.0  ASSIMILATION  SCHEME  AND  EXPERIMENTS 

Testing  a  new  error  covariance  scheme  is  an  intensive  and  time  consuming  project.  Such 
validation  normally  requires  many  months  of  trials  involving  numerous  real  data  experiments 
from  a  variety  of  oceanic  conditions  and  regimes.  Conducting  this  sort  of  validation  is  beyond 
the  scope  of  this  study.  However,  two  experiments  are  presented  here:  (1)  a  simulated  and  (2) 
real  data  experiment  to  demonstrate  the  capabilities  of  this  implicit  solution  background  error 
correlation  operator  in  a  data  assimilation  environment. 

3.1  Assimilation  Scheme  and  Forecast  Model 

For  these  experiments  the  three-dimensional  variational  (3D-Var)  analysis  scheme  is 
utilized.  The  analysis  equation  employed  is  the  following 

xa  =  xb  +  BHT(HBHT  +  R)-1(y-Hxb),  (6) 

where  each  variable  follows  the  conventional  definition.  For  simplicity  R  is  taken  as  the 
diagonal  matrix  containing  only  the  observation  variances. 

The  oceanic  forecast  model  is  the  Navy  Coastal  Ocean  Model  (NCOM)  and  is  capable  of 
producing  ocean  forecasts  of  temperature,  salinity,  sea  surface  height,  and  velocity  for  regional 
near-shore  environments  or  for  the  global  oceans  (Martin  2000).  The  model  has  a  free  surface 
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and  is  based  on  the  primitive  equations.  Surface  forcing  conditions  (e.g.  wind  stress,  infrared 
radiation  flux,  etc.)  are  provided  by  the  global  Navy  Operational  Global  Atmospheric  Prediction 
System  (NOGAPS,  Rosmond  et  al.  2002)  with  0.5°  horizontal  grid  resolution.  The  NOGAPS 
forcings  are  archived  every  12  hours  at  the  synoptic  times  of  0000  and  1200  UTC. 

3.2  Monterey  Bay  Simulated  Data  Assimilation 

3.2.1  Experiment  Design 

This  experiment  involves  simulated  data  for  a  Monterey  Bay  simulation.  Here  the 
NCOM  model  has  an  81x57x41  grid  with  a  variable  horizontal  grid  resolution  between  1-4  km. 
For  this  experiment  two  NCOM  forecasts  are  run,  one  during  the  time  frame  of  January,  2007 
and  the  second  for  the  following  month  of  February,  2007.  For  the  sake  of  this  discussion,  the 
January  forecast  will  be  referred  to  as  the  control  (CTRL)  and  the  February  forecast  as  the 
observations  (OBS).  Model  profile  data  of  temperature  and  salinity  are  selected  at  24-hr  intervals 
from  the  OBS  model  run  at  13  locations  (figure  4)  throughout  the  grid.  A  low  number  of  profile 
data  locations  is  selected  to  mimic  the  sparse  distribution  of  real-world  profile  observations. 
Data  from  OBS  are  assimilated  in  a  24-hr  update  cycle  and  the  resulting  analysis  is  used  to  run 
another  NCOM  forecast  for  January,  2007;  OBS  data  from  Feb.  1st  is  assimilated  into  the 
analysis  for  Jan.  1st;  OBS  data  from  Feb.  2nd  is  assimilated  into  the  analysis  for  Jan  2nd,  and  so 
on.  This  forecast  will  be  referred  to  as  the  optimal  forecast  (3DV).  Here,  there  are  two  optimal 
forecasts  performed,  one  using  an  analysis  created  with  the  explicit  solution  correlation  operator 
(3DV-EXP)  and  one  with  the  implicit  solution  correlation  operator  (3DV-IMP).  Both  analyses 
are  evaluated  by  examining  the  difference  fields  between  the  3DV  forecasts  and  the  OBS 
forecast  at  non-assimilated  locations.  And  the  overall  forecast  is  evaluated  using  a  normalized 
error  metric,  which  is  developed  to  evaluate  many  aspects  of  these  experiments,  and  is  a  relative 
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measurement  of  either  the  analysis  or  forecast  error  through  time.  The  metric  eb  is  defined  as 


eb  = 


(7) 


where  k  is  the  observation  index,  xfj  is  the  model  state  (mapped  to  the  observation  space  by  H), 
yk  is  the  observation,  o})hs  is  the  observation  variance.  The  metric  is  computed  as  a  time  series  at 
each  24-hour  time  level  and  each  24-hour  value  is  normalized  by  the  initial  value  at  t=0.  In  order 
to  compare  results,  both  the  3DV-IMP  and  the  3DV-EXP  results  have  been  normalized  by  the 
t=0  values  from  the  3DV-IMP  experiment.  It  should  be  noted  that  the  correlation  length  scale  is 
based  on  the  Rossby  radius  of  deformation  and  is  variable  between  20-30  km. 

3.2.2  Experiment  Results 

Figure  5  shows  a  time-evolution  (six-panel  plot)  of  the  3DV  minus  OBS  (analysis  minus 
observation)  profile  difference  for  temperature  (7)  at  a  non-assimilated  model  grid  point.  The 
closest  selected  assimilated  OBS  point  to  this  profile  location  is  12  km.  The  CTRL  profile  is 
shown  in  red,  the  3DV-IMP  in  green,  and  the  3DV-EXP  in  blue.  The  six  panels  show  the  profile 
for  the  initial  time  (day  1)  then  days  5,  10,  15,  20,  and  25.  The  3DV-EXP  and  3DV-IMP  profiles 
are  remarkably  similar,  as  the  difference  in  correlation  operator  does  not  produce  diverging 
results.  Also,  the  T  difference  is  reduced  significantly  in  the  3DV  results  when  compared  to  the 
CTRL.  The  analysis  seems  to  be  better  at  depth  than  near  the  surface,  but  this  is  likely  due  to  the 
fact  that  the  ocean  is  more  variable  near  the  surface  (due  to  prevailing  surface  forcing 
conditions).  Nevertheless,  the  3DV  results  (both  implicit  and  explicit)  show  improvement  in  the 
near  surface  T  difference  when  compared  to  the  CTRL  results.  Figure  6  shows  this  same  profile 
time  evolution  for  salinity  ( S ).  The  results  shown  for  this  profile  are  representative  of  the  entire 
model  solution  (all  profiles). 
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Figure  7  shows  the  normalized  error  metric  for  the  full  28-day  3DV  forecast.  The  T  error 
is  shown  in  red,  S  in  blue,  and  total  velocity  ( V)  i n  green.  The  3DV-IMP  (3DV-EXP)  results  are 
shown  as  the  solid  lines  (dashed  lines).  It  is  clear  that  as  the  24-hr  update  assimilation  cycle 
continues  through  the  28-day  forecast,  the  errors  for  all  fields  decrease  dramatically.  The  largest 
decrease  occurs  in  the  first  ten  days  as  the  errors  in  most  fields  drop  from  1.0  to  0.3-0. 4.  Also,  it 
is  worth  noting  that  the  3DV-IMP  and  3DV-EXP  results  are  nearly  identical. 

The  timing  resuls  for  this  experiment  indicate  the  improved  computational  efficiency  of 
the  implicit-solution  correlation  operator  over  the  explicit  version.  For  the  experiment  utilizing 
the  explicit-solution  correlation  operator,  the  full  28-day  analysis-forecast  cycle  ran  in  128 
minutes  using  a  single-processor  Opteron  2200  2.8  GHz  computer,  whereas  the  version  using  the 
implicit-solution  correlation  operator  took  just  under  an  hour  at  42  minutes,  a  savings  ratio  of 
roughly  3:1. 

3.3  RIMPAC  Real  Data  Assimilation 

3. 3. 1  Experiment  Design 

The  real  data  assimilation  experiment  concerns  a  geographical  region  surrounding  the 
Hawaiian  island  chain  during  a  15-day  period  from  16  June  to  30  June,  2008.  The  NCOM  grid 
used  for  this  experiment  is  157x130x46  with  a  6  km  grid  resolution  with  boundary  conditions 
supplied  from  the  operational  global  NCOM  run.  Observations  are  selected  from  a  portion  of  the 
Navy’s  RIMPAC  (Rim  of  the  Pacific)  exercise  with  a  24-hr  update  cycle  used  to  assimilate  the 
observations  at  0000  UTC  each  day;  assimilated  data  are  collected  from  a  +/-  12-hour  window 
around  the  analysis  time.  This  3D-Var  routine  is  linked  directly  to  the  observation  preparatory 
and  quality  control  program  suite  from  the  Navy  Coupled  Ocean  Data  Assimilation  (NCODA) 
system,  known  as  NCODA-prep  (Cummings,  2005).  This  was  done  to  ensure  that  the  included 
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observations  were  of  operational  quality  and  to  take  advantage  of  the  well-established  data 
selection  and  quality  control  routines  already  included  in  the  NCODA  system. 

The  observational  data  used  for  this  experiment  include  many  sources  such  as  oceanic 
gliders  (Rudnick  et  al.  2004),  ARGO  float  profiles  (Roemmich  et  al.  2001),  and  Modular  Ocean 
Data  Assimilation  System  synthetics  profiles  (MODAS,  Fox  et  al.  2002),  however  only  three 
variables  are  considered  in  the  analysis:  T,  S  (surface  and  sub-surface),  and  SSH.  The 
background  error  variances  are  calculated  as  in  Cummings  (2005).  As  in  the  simulated  data 
experiment,  two  assimilation  model  runs  are  conducted  here.  One  uses  a  3D-Var  with  the 
implicit  correlation  operator  (IMPL)  and  the  other  with  the  explicit  operator  (EXPL). 

3.3.2  Experiment  Results 

To  investigate  the  analysis,  four  observational  profiles  of  T  and  S  have  been  excluded 
from  each  of  the  assimilation  cycles  to  be  used  as  independent  observations  for  validation 
purposes;  results  from  one  profile  location  are  shown  here.  The  location  of  this  profile  is  shown 
relative  to  other  observation  data  in  figure  8  (red  cross)  for  16  June,  2008.  It  should  be  noted 
that  this  figure  includes  the  location  of  surface-only  (black  dots)  and  profile  observation 
locations  (red  dots).  The  observation  profiles  are  compared  to  profiles  from  three  model  runs: 
(1)  the  first  guess  (FG)  field,  (2)  a  3D-Var  analysis  using  the  implicit  solution  correlation 
operator  (IMPL)  and  (3)  a  3D-Var  analysis  using  the  explicit  solution  correlation  operator 
(EXPL).  As  in  the  simulated  data  experiment,  the  Rossby  radius  of  deformation  is  used  here  to 
define  the  correlation  length  scale,  resulting  in  correlation  length  scales  ranging  from  55-70  km. 

Figure  9  shows  a  comparison  between  three  model  profiles  (from  FG,  IMPL,  and  EXPL) 
and  one  non-assimilated  observation  profile  at  0000  UTC  16  June,  2008.  Absolute  differences 
are  calculated  between  the  observation  value  and  FG  (red),  IMPL  (green),  and  EXPL  (blue)  for  T 
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(fig.  9a)  and  S  (fig.  9b).  The  FG  field  features  a  large  discrepancy  with  the  observation  value  for 
both  T  and  S  for  just  below  the  surface  to  around  500  m.  This  region  has  been  identified  as  the 
mixed  layer  (figure  omitted),  and  suggests  that  NCOM  is  having  some  difficulty  correctly 
simulating  the  thermodynamic  properties  of  this  layer.  This  discrepancy  is  reduced  significantly 
in  the  IMPL  and  EXPL  results,  even  though  the  nearest  assimilated  profile  observation  in  this 
analysis  is  at  42  km  distance. 

The  IMPL  and  EXPL  difference  profiles  are  very  similar,  especially  in  the  S  values.  The 
T  difference  profiles  show  some  dissimilarities,  however  these  are  small,  on  order  of  about  0.05 
degrees  Celsius.  These  differences,  however,  have  little  impact  on  the  overall  structure  of  the 
analysis  and  on  the  performance  of  the  resulting  forecast  (to  be  shown  later).  It  is  important  to 
note  that  the  results  from  the  other  withheld  profiles,  for  all  time  levels,  are  similar  to  the 
example  shown  here. 

A  brief  assessment  of  the  real  data  forecast,  using  the  3D-Var  analysis,  is  shown  in  fig. 
10  using  error  metric  (7).  The  T  error  is  in  red,  S  error  in  blue,  SSI  I  error  in  green,  and  the  IMPL 
(EXPL)  results  are  shown  as  solid  (dashed)  lines.  Clearly  the  errors  in  all  fields  are  nearly 
identical  between  the  IMPL  and  EXPL  model  runs.  The  error  generally  decreases  as  the  forecast 
progresses,  indicating  that  the  assimilation  of  observations  is  having  a  positive  impact  on  the 
analysis  and  the  resulting  forecast.  There  appears  to  be  little  statistical  difference  between  the 
two  forecasts,  save  for  the  amount  of  CPU  time  required  to  run  the  two  systems.  For  this  real- 
data  experiment  in  the  RIMPAC  region  the  3D-Var  using  the  IMPL  out-performs  the  EXPL 
nearly  5:1,  at  4  hours  to  the  EXPL’s  18  hours  (using  a  single  processor  Opteron  2200  2.8  GHz 
computer).  Clearly,  this  represents  a  significant  savings  in  terms  of  computational  time  and 
resources. 
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4.0  SUMMARY  AND  FUTURE  WORK 


An  alternative  method  of  error  correlation  modeling  based  on  a  diffusion  equation  has 
been  presented.  It  is  known  that  a  correlation  operator  relying  on  the  explicit  solution  to  a 
diffusion  equation  is  conditionally  stable,  thereby  limiting  the  size  of  the  time  step  in  order  to 
maintain  stability.  This  is  especially  true  of  modeling  anisotropic  correlations  where  the 
correlation  length  scale  changes  spatially  over  a  model  grid  with  fixed  resolution.  In  this  case, 
the  explicit  solution  correlation  operator  requires  a  large  number  of  small  time  steps  in  order  to 
maintain  stability.  On  the  other  hand,  the  implicit  solution  correlation  operator  is 
unconditionally  stable  and  does  not  require  a  drastic  increase  in  computational  time  to  handle 
any  correlation  shape  that  may  be  desired. 

Comparisons  between  the  explicit  and  implicit  solution  correlation  operators  demonstrate 
that  the  implicit  solution  produces  correlations  very  similar  to  the  explicit  solution  in  terms  of 
magnitude,  shape,  and  spatial  size.  The  performance  of  the  operators  are  shown  in  terms  of  both 
simulated  and  real  data  assimilation  experiments.  In  the  real  data  experiment,  observations 
collected  by  the  NCODA  preparation  subroutines  are  used  in  a  15  day  forecast,  in  which  a  24- 
hour  forecast  cycle  assimilates  data  daily  at  0000  UTC.  Here  observational  profiles  of 
temperature  and  salinity  are  withheld  at  four  locations  in  each  assimilation  cycle;  these  profiles 
are  then  used  to  evaluate  the  performance  of  the  assimilation  scheme  and  the  correlation 
operators.  The  results  suggest  that  the  correlation  operators  perform  well  in  spreading 
information  from  the  observations  throughout  the  model  grid,  and  to  locations  where  no  data  are 
assimilated.  Also,  the  results  from  the  RIMPAC  real-data  experiment  show  that  the  implicit  and 
explicit  solution  operators  provide  nearly  identical  results  in  the  3D-Var  system;  however,  the 
3D-Var  using  the  implicit  solution  correlation  operator  ran  in  approximately  1/5  the  time  of  the 
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system  employing  the  explicit  operator. 

The  next  step  in  this  research  is  to  perform  a  more  robust  validation  of  this  implicit 
covariance  operator.  This  would  involve  testing  the  performance  of  the  operator  in  real  data 
assimilation  experiments  involving  numerous  environmental  regimes  (i.e.  deep  water,  near  shore, 
strong  upwelling/downwelling,  presence  of  fronts,  etc.)  to  assess  how  well  the  system  adapts  to 
changing  conditions. 
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7.0  FIGURES 


For  fixed  grid  spacing  at  I.OxlO3  (m) 


Length  Scale  x  104(m) 


Figure  1:  CPU  time  to  run  diffusion-based  correlation  operator  for  a  fixed  grid  resolution  and  varying  correlation 
length  scales.  Implicit  solver  in  dashed  blue,  explicit  in  solid  red;  y-axis  is  CPU  time  (in  seconds)  and  x-axis  is 
correlation  length  scale  (in  104  m). 
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Figure  2:  Example  correlation  function  centered  at  78x65  on  arbitrary  grid;  center  value  is  1.0  for  (a)  the  explicit 
solver  and  (b)  the  implicit  solver;  (c)  displays  the  difference  field  explicit  minus  implicit.  Grid  resolution  and 
correlation  length  scale  are  fixed  at  6.0xl03  m  and  6.0xl04  m,  respectively. 
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Correlation  Near  Shore 


b) 


Correlation  Near  Shore 


a) 


Longitude  (*W) 


Longitude  ( W) 


Figure  3:  Anisotropic  correlation  function  (with  length  scale  modification)  near  the  coast  of  Monterey  Bay, 
California,  U.S.  A.  using  the  (a)  explicit  and  (b)  implicit  solver;  (c)  shows  the  difference  field  explicit  minus 
implicit;  (d)  shows  the  bathymetry. 
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SST  (in  C)  F9bmary  1 , 2007  (0000  UTC)  *C 


Figure  4:  13  T/S  profile  locations  selected  in  NCOM  domain  for  simulated  ocean  data  assimilation  experiment. 
Field  shown  is  SST  (in  °C)  for  1  February,  2007  at  0000  UTC. 
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Temperature  Departure  from  OBS  (in  Deq  C) 


Temperature  Departure  from  OBS  (in  Deg  C) 


Temperature  Departure  from  OBS  (in  Deq  C) 


Temperature  Departure  from  OBS  (in  Deq  C) 


Figure  5:  CTRL  minus  OBS  (solid  red),  3DV-IMP  minus  OBS  (dash  green),  and  3DV-EXP  minus  OBS  (dash  blue) 
temperature  difference  at  the  initial  time  and  at  days  5,  10,  15,  and  25  for  an  NCOM  grid  point  within  12  km  of  a 
nearby  observation  location. 
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NCOM  Depth  (m)  NCOM  Depth  (m) 


Salinity  Departure  from  OBS  (in  psu) 


Figure  6:  Same  as  in  fig.  5,  but  for  salinity. 
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Normalized  Error  in  Forecast  (Initialized  using  3D-Var  Analysis) 


Figure  7:  Relative  forecast  error  in  T  (red),  S  (blue),  and  total  v  (green)  for  3DV-IMP  (solid)  and  3DV-EXP 
(dashed).  Relative  error  is  normalized  by  error  in  each  field  at  day  1 .  Error  is  calculated  as  in  (8). 
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Observation  Locations  (16  June  2008)  Over  3D-Var  SST-Increment  Field 
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Figure  8:  All  temperature  observation  locations  for  16  June,  2008  including  surface  observations  (black  dots)  and 
profile  observations  (red  dots)  overlaid  on  3D-Var  surface  temperature  increments.  Location  of  excluded  profile 
used  as  independent  observation  indicated  by  red  cross. 


a)  b) 


Temperature  (K)  Salinity  (psu) 

Figure  9:  Absolute  difference  between  observation  profile  and  first  guess  (FG)  profile  (red),  analysis  that  used  the 
implicit  solution  correlation  operator  (green),  and  analysis  that  used  the  explicit  operator  (blue).  Left  (a)  is  for 
temperature,  panel  (b)  is  for  salinity.  Comparison  is  valid  at  0000  UTC  16  June,  2008. 
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Implicit  vs.  Explicit 


Figure  10:  Relative  forecast  error  in  T  (red),  S  (blue)  and  SSH  (green)  for  IMPL  model  run  (solid)  and  EXPL  model 
run  (dashed).  Relative  error  is  normalized  by  error  in  each  field  at  day  1.  Error  is  calculated  as  in  (8). 
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